Skip to content

Conversation

@bnestere
Copy link
Contributor

@bnestere bnestere commented Oct 15, 2025

The binary log could be corrupted when committing a large transaction
(i.e. one whose data exceeds the binlog_cache_size limit and spills
into a tmp file) in binlog_format=row if the server's --tmp-dir is
full. The corruption that happens is only the GTID of the errored
transaction would be written into the binary log, without any
body/finalizing events. This would happen because the content of the
transaction wasn't flushed at the proper time, and the transaction's
binlog cache data was not durable while trying to copy the content
from the binlog cache file into the binary log itself. While switching
the tmp file from a WRITE_CACHE to a READ_CACHE, the server would see
there is still data to flush in the cache, and first try to flush it.
This is not a valid time to flush that data to the temporary file
though, as:

  1. The GTID event has already been written directly to the binary
    log. So if this flushing fails, it leaves the binary log in a
    corrupted state.

  2. This is done during group commit, and will slow down other
    concurrent transactions, which are otherwise ready to commit.

This patch fixes these issues by ensuring all transaction data is
fully flushed to its temporary file (if used) before starting any
critical paths, i.e. in binlog_flush_cache(). Note that if the binlog
cache is solely in-memory, this flush-to-temporary-file is skipped.

This PR is organized as follows:
Commit 1: The regression that reproduces the bug report
Commit 2: The fix
And any future commits will be addressing reviews.

@bnestere bnestere added MariaDB Corporation Replication Patches involved in replication labels Oct 15, 2025
@bnestere bnestere force-pushed the 10.11-MDEV-37662 branch 2 times, most recently from 9213ed1 to 72d0eef Compare October 21, 2025 19:54
Copy link
Contributor

@andrelkin andrelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The work looks good. Special thanks for a solid analysis!
I have only a kind of cosmetic notes.

sql/log.cc Outdated
Returns TRUE on success, FALSE on error.
*/
static my_bool binlog_cache_reconcile_data_in_storage(IO_CACHE *info)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new function purpose is clear, but its introduction as static feels an overkill..

sql/log.cc Outdated
ready-to-commit (concurrent) transactions could be stalled
*/
if (using_stmt && !thd->binlog_flush_pending_rows_event(TRUE, FALSE) &&
binlog_cache_reconcile_data_in_storage(
Copy link
Contributor

@andrelkin andrelkin Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. so instead my_b_get_pos_in_file(info) && flush_io_cache(info) would be a fair enough at this point, as a new inline, why not flush_pending_bytes(info); flushin the name, while reads naturally, is also for consistency with the caller and a sibling's names.
Also the suggested name suggests to me (much) shorter comments that your static function's header ones..
Indeed, the whole point of the caller function is to get all stuff to disk. That, as it turns out in this bug, just deals on few levels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the inline change, but the change to the function name I'm still not sold on (though I'm not sold on it as it is either). It isn't guaranteed to flush if the content should all be held in-memory. I'll think on a better function name for a couple days before pushing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush_write_buffer_if_needed() 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's closer, and to finish it off, I think flush_write_buffer_if_file_backed() would make it more specific.

@bnestere bnestere force-pushed the 10.11-MDEV-37662 branch 2 times, most recently from b989750 to 7707e5c Compare November 18, 2025 20:03
@knielsen
Copy link
Member

knielsen commented Nov 19, 2025 via email

The binary log could be corrupted when committing a large transaction
(i.e. one whose data exceeds the binlog_cache_size limit and spills
into a tmp file) in binlog_format=row if the server's --tmp-dir is
full. The corruption that happens is only the GTID of the errored
transaction would be written into the binary log, without any
body/finalizing events.  This would happen because the content of the
transaction wasn't flushed at the proper time, and the transaction's
binlog cache data was not durable while trying to copy the content
from the binlog cache file into the binary log itself. While switching
the tmp file from a WRITE_CACHE to a READ_CACHE, the server would see
there is still data to flush in the cache, and first try to flush it.
This is not a valid time to flush that data to the temporary file
though, as:

  1. The GTID event has already been written directly to the binary
     log. So if this flushing fails, it leaves the binary log in a
     corrupted state.

  2. This is done during group commit, and will slow down other
     concurrent transactions, which are otherwise ready to commit.

This patch fixes these issues by ensuring all transaction data is
fully flushed to its temporary file (if used) before starting any
critical paths, i.e. in binlog_flush_cache(). Note that if the binlog
cache is solely in-memory, this flush-to-temporary-file is skipped.

Reviewed-by: Andrei Elkin <[email protected]>
Signed-off-by: Brandon Nesterenko <[email protected]>
@bnestere bnestere merged commit 0f5e5ce into 10.11 Nov 19, 2025
16 of 17 checks passed
@bnestere bnestere deleted the 10.11-MDEV-37662 branch November 19, 2025 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

MariaDB Corporation Replication Patches involved in replication

Development

Successfully merging this pull request may close these issues.

4 participants